
\subsection{Visualizations}
\subsubsection{One-versus-one binary classification}
To understand how well our classifiers perform, we carried out some statistical analysis of the data.
We started with a one-versus-one comparison of the classes to identify the most difficult
cases to handle. The results obtained with Support Vector Machines are summarized in
figure \ref{figure:svm1vs1}, where the color of cell $(i,j)$ corresponds to the
classification error of the binary classifier that separates class $i$ from class $j$.
Darker colors indicate higher classification errors.

\begin{figure}[!htb]
	\centering
	\includegraphics[width=0.6\textwidth]{figures/SVM1vs1.pdf}
	\caption{One-versus-one error rates}
	\label{figure:svm1vs1}
\end{figure}
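The construction of such an error matrix can be sketched as follows. This is a minimal illustration, assuming the per-pair error rates have already been measured; the function name and the toy numbers are our own and not part of the experiments above.

```python
import numpy as np

def pairwise_error_matrix(pair_errors, n_classes):
    """Build the symmetric matrix behind a one-versus-one error plot.

    pair_errors maps an ordered pair (i, j), i < j, to the error rate of
    the binary classifier trained to separate class i from class j.
    """
    m = np.zeros((n_classes, n_classes))
    for (i, j), err in pair_errors.items():
        m[i, j] = err
        m[j, i] = err  # comparing i with j is the same task as j with i
    return m

# Toy example with 3 classes: classes 1 and 2 are hard to separate.
errors = {(0, 1): 0.02, (0, 2): 0.05, (1, 2): 0.35}
M = pairwise_error_matrix(errors, 3)
print(M[1, 2], M[2, 1])  # 0.35 0.35
```

Plotting such a matrix with darker colors for larger entries yields exactly the kind of heat map shown in figure \ref{figure:svm1vs1}.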

One interesting but natural property is that the error rate is higher close to the diagonal.
This was expected, because the classes are ordered so that phonemes with similar phonetic
properties are close to each other and grouped into categories. For the same reason, error
rates within a category are relatively higher than error rates between different categories.

Through this procedure we were able to pin down the pairs of classes that are hard to tell
apart and examine how our classifiers treat those cases. To get a more detailed view of the
difficulty of classification, we projected the points and our classifiers onto two dimensions,
which lets us check whether the classifiers behave sensibly and match our intuition. Figure
\ref{figure:planar} shows planar projections of an easy (1 vs 42), a medium (5 vs 12) and a hard (23 vs 24) instance.

\begin{figure}[!htb]
	\centering
\subfloat[Easy: Class 1 vs Class 42]{
\includegraphics[width=0.3\textwidth]{figures/1v42.pdf}
}
\subfloat[Medium: Class 5 vs Class 12]{
\includegraphics[width=0.3\textwidth]{figures/5v12.pdf}
}
\subfloat[Hard: Class 23 vs Class 24]{
\includegraphics[width=0.3\textwidth]{figures/23v24.pdf}
}	
	\caption{Planar projections of points and classifiers}
	\label{figure:planar}
\end{figure}

The 2-component Gaussian Mixture Model of each class is depicted with green circles, and the Support Vector Machine decision boundary is depicted as
a black vertical line through zero. In the easy case the data are linearly separable, while in the other cases several points are misclassified.
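The two decision rules being compared in these plots can be sketched in a few lines: the GMM assigns a point to the class with the larger mixture density, while the linear SVM thresholds a single inner product. The mixture parameters and the test point below are hypothetical stand-ins, chosen only to illustrate the two rules; they are not the fitted models from our experiments.

```python
import numpy as np

def gaussian(x, mu, var):
    # Isotropic 2-D Gaussian density (a simplifying assumption for this sketch).
    d = x - mu
    return np.exp(-d @ d / (2 * var)) / (2 * np.pi * var)

def gmm_density(x, means, variances, weights):
    # 2-component mixture: weighted sum of the component densities.
    return sum(w * gaussian(x, m, v) for m, v, w in zip(means, variances, weights))

# Hypothetical per-class mixtures; the green circles in the plots would mark
# the component means and spreads of such models.
class_a = ([np.array([-2.0, 0.0]), np.array([-1.0, 1.0])], [0.5, 0.5], [0.6, 0.4])
class_b = ([np.array([2.0, 0.0]), np.array([1.0, -1.0])], [0.5, 0.5], [0.5, 0.5])

x = np.array([-1.5, 0.2])
gmm_label = 'a' if gmm_density(x, *class_a) > gmm_density(x, *class_b) else 'b'

# A linear SVM boundary in the same plane: the sign of w.x + b.
w, b = np.array([1.0, 0.0]), 0.0       # a vertical boundary through zero
svm_label = 'a' if w @ x + b < 0 else 'b'
print(gmm_label, svm_label)  # both rules assign x to class a here
```

The two rules agree on this point; the hard pairs in figure \ref{figure:planar} are precisely those where the class densities overlap so much that no such boundary separates them cleanly.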

\subsubsection{Multi-class Classification}
For classification over all classes, we can generate a table similar to that of the previous section, describing the confusion between classes. Figure \ref{figure:confusionall} shows the confusion matrices for Support Vector Machines, Gaussian Mixture Models and Random Forests.

\begin{figure}[!htb]
	\centering
\subfloat[Support Vector Machines]{
\includegraphics[width=0.5\textwidth]{figures/cmSVM.pdf}
}
\subfloat[Gaussian Mixture Models]{
\includegraphics[width=0.5\textwidth]{figures/cmGMM.pdf}
}
\qquad
\subfloat[Random Forests]{
\includegraphics[width=0.5\textwidth]{figures/cmRF.pdf}
}	
	\caption{Confusion Matrices for different classifiers}
	\label{figure:confusionall}
\end{figure}
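A confusion matrix of this kind is simple to compute from the true and predicted labels; the following is a minimal sketch with made-up labels, not our actual experimental outputs.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix C with C[i, j] = number of samples of true class i predicted as j."""
    C = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    return C

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]  # hypothetical classifier outputs
C = confusion_matrix(y_true, y_pred, 3)
print(C)
```

Each row sums to the number of samples of that class; the off-diagonal entries are the confusions visualized in figure \ref{figure:confusionall}.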

\subsection{SVM vs GMM vs RF}

Table \ref{table:error_summary} summarizes the results of the previous sections for the different classification techniques on the reduced set of 39 classes. Overall, all methods achieve error rates below 30\%. The Gaussian Mixture Model classifier performs better than the other two, followed by Support Vector Machines. Taking into account the reduced training time of GMMs, we conclude that their current popularity is well justified.

\begin{table}[!htb]
\centering
\begin{tabular}{|c|c|c|c|}
\hline
Method & Error Rate (Train) & Error Rate (Dev) & Error Rate (Test) \\
\hline
SVM & 22.38 \% & 24.49\% & 26.28\%  \\
\hline
GMM & 17.50\% & 23.83\% & 24.91\%  \\
\hline
RF & 0.00\% & 27.82\% & 28.16\%  \\
\hline
\end{tabular}
\caption{Error rates on the data sets for SVM, GMM, RF}
\label{table:error_summary}
\end{table}

However, the other methods have performance comparable to GMMs, and in several cases they classify correctly where GMMs fail. In the subsequent sections we make several attempts to combine the outputs of the different classifiers, either through hybrid methods or through committees, to increase the overall classification accuracy. An upper bound on what such a combination could achieve is given by an oracle that picks a correct prediction whenever at least one of the three classifiers is correct. These ideal combined error rates for the dev set are summarized in table \ref{table:combined_accuracy}.

\begin{table}[!htb]
\centering
\begin{tabular}{|c|c|c|c|c|}
\hline
 & GMM-SVM & GMM-RF & SVM-RF & GMM-SVM-RF \\
\hline
Error Rate & 14.45 \% & 15.95\% & 17.53\% & 12.15\%  \\
\hline
\end{tabular}
\caption{Combined error rates on the dev set}
\label{table:combined_accuracy}
\end{table}
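The oracle bound in table \ref{table:combined_accuracy} counts a sample as an error only when every classifier in the committee gets it wrong. A minimal sketch, with hypothetical labels and predictions rather than our actual outputs:

```python
def oracle_error(y_true, predictions):
    """Error rate of an oracle that is correct whenever any classifier is.

    predictions is a list of per-classifier prediction lists, all aligned
    with y_true; a sample counts as wrong only if every classifier misses it.
    """
    wrong = sum(all(p[i] != t for p in predictions)
                for i, t in enumerate(y_true))
    return wrong / len(y_true)

y_true = [0, 1, 2, 3]
svm = [0, 1, 0, 0]   # hypothetical outputs, for illustration only
gmm = [0, 0, 2, 0]
rf  = [1, 1, 2, 0]
print(oracle_error(y_true, [svm, gmm, rf]))  # 0.25: only the last sample is missed by all
```

Real committees (e.g. majority voting) cannot reach this bound in general, since they have no access to the true label when the classifiers disagree, but the gap between table \ref{table:error_summary} and table \ref{table:combined_accuracy} indicates how much room a combination scheme has.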
